dCDH by_path Wave 5 #11: + heterogeneity (predict_het per-by_level) #412
igerber wants to merge 3 commits into
PR Review Report

Overall Assessment

Executive Summary

Methodology

Finding M-1: P3 Informational (documented partial/deviation surface, no action required)
Severity: P3

Code Quality

Finding CQ-1: P1
/ai-review
🔁 AI review rerun (requested by @igerber)

Head SHA:

Overall Assessment

✅ Looks good: no unmitigated P0 or P1 findings in this re-review.

Executive Summary

Methodology

M-1: P3 Informational (documented partial R
/ai-review
🔁 AI review rerun (requested by @igerber)

Head SHA:

Overall Assessment

✅ Looks good: no unmitigated P0/P1 findings in this re-review.

Executive Summary

Methodology

M-1: P3 Informational (documented partial R
Lifts the gate at chaisemartin_dhaultfoeuille.py:1230-1234 so per-path
event-study disaggregation composes with heterogeneity="<col>" (Web
Appendix Section 1.5, Lemma 7), mirroring R did_multiplegt_dyn(...,
by_path, predict_het) per-by_level dispatch.
Per-path heterogeneity is computed by re-running the Lemma 7 regression
on each path-restricted switcher subsample. New `path_groups`
(Optional[Set[int]]) parameter on _compute_heterogeneity_test restricts
eligibility to switchers ON path p; the variance machinery (standard
WLS vcov for non-survey, cell-period IF allocator for Binder TSL,
group-level allocator for Rao-Wu replicate) is unchanged from the
global heterogeneity path. Cohort dummies absorb baseline by
construction, so multi-baseline switcher panels do not produce
R-divergence (no parallel UserWarning like controls / trends_linear).
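
A minimal sketch of the path restriction described above. The function name `slope_on_path`, the input layout, and the plain one-regressor OLS are illustrative assumptions, not the library's `_compute_heterogeneity_test`; cohort dummies, weights, and the variance machinery are deliberately left out. It only shows how an optional `path_groups` set narrows the eligible switchers before the same regression is re-run.

```python
from typing import Optional, Sequence, Set, Tuple


def slope_on_path(
    group_ids: Sequence[int],
    x: Sequence[float],
    y: Sequence[float],
    path_groups: Optional[Set[int]] = None,
) -> Tuple[float, int]:
    """Toy OLS slope of y on x, optionally restricted to groups in path_groups.

    Hypothetical stand-in for re-running a regression on a path-restricted
    switcher subsample. Returns (slope, n_obs).
    """
    # path_groups=None means the global fit: every switcher stays eligible.
    rows = [
        (xi, yi)
        for g, xi, yi in zip(group_ids, x, y)
        if path_groups is None or g in path_groups
    ]
    n = len(rows)
    mean_x = sum(xi for xi, _ in rows) / n
    mean_y = sum(yi for _, yi in rows) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in rows)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in rows)
    return sxy / sxx, n


# Toy data: groups 1-3 follow one path (slope 2), groups 4-6 another (slope -1).
groups = [1, 2, 3, 4, 5, 6]
x = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
y = [0.0, 2.0, 4.0, 1.0, 0.0, -1.0]

global_slope, global_n = slope_on_path(groups, x, y)
path_slope, path_n = slope_on_path(groups, x, y, path_groups={1, 2, 3})
```

In this toy data the pooled fit averages two opposite-signed relationships, while the path-restricted fit recovers the slope for the selected path only, which is the reason for disaggregating by path in the first place.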
Surfaces on results.path_heterogeneity_effects keyed
{path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}} and on
to_dataframe(level="by_path") via new always-present het_* columns,
populated for positive horizons and NaN otherwise (mirrors cband_* /
cumulated_* convention). Per-(path, horizon) inference is refreshed
in the final R2 P1b block so all surfaces use the same df_survey
after replicate-weight n_valid appends.
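
The nested shape and the NaN convention above can be illustrated with a small flattening sketch. The helper `by_path_rows`, the concrete column names (`het_beta`, `het_se`, ...), and the numbers are assumptions for illustration; the source only states that `het_*` columns are always present, populated for positive horizons and NaN otherwise.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple

Path = Tuple[int, ...]

# Assumed subset of per-(path, horizon) fields; conf_int is omitted for brevity.
HET_FIELDS = ("beta", "se", "t_stat", "p_value", "n_obs")


def by_path_rows(
    path_het: Dict[Path, Dict[int, Dict[str, Any]]],
    horizons: Sequence[int],
) -> List[Dict[str, Any]]:
    """Flatten {path: {l: {...}}} into rows with always-present het_* keys."""
    rows = []
    for path in sorted(path_het):
        for l in horizons:
            # Only positive horizons can carry a heterogeneity entry.
            entry = path_het[path].get(l) if l > 0 else None
            row: Dict[str, Any] = {"path": path, "horizon": l}
            for field in HET_FIELDS:
                row[f"het_{field}"] = entry[field] if entry is not None else math.nan
            rows.append(row)
    return rows


# Placeholder values, not real estimates.
path_het = {
    (0, 1, 1): {
        1: {"beta": 0.25, "se": 0.1, "t_stat": 2.5, "p_value": 0.05, "n_obs": 40},
    }
}
rows = by_path_rows(path_het, horizons=[-1, 1, 2])
```

Keeping the columns present for every row, instead of adding them only when heterogeneity was requested, means downstream code can select `het_*` columns without first checking how the model was fit.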
R parity: introduces the FIRST predict_het R-parity baseline in the
repo. Two new scenarios (multi_path_reversible_predict_het global
anchor + multi_path_reversible_by_path_predict_het per-path) use
dont_drop_larger_lower=TRUE to match drop_larger_lower=False and
provide cohort variation under reversal paths. Per-path beta and SE
match R within rtol=1e-6.
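
A relative-tolerance parity check of this kind might look like the sketch below. It uses the standard-library `math.isclose` as a stand-in comparator (the repo's tests may use a different one with slightly different `rtol` semantics), and `parity_mismatches` plus all numbers are hypothetical placeholders rather than values from the golden file.

```python
import math
from typing import Dict, List, Tuple

Path = Tuple[int, ...]
Effects = Dict[Path, Dict[int, Dict[str, float]]]


def parity_mismatches(
    py: Effects, r: Effects, rtol: float = 1e-6
) -> List[Tuple[Path, int, str]]:
    """Return (path, horizon, field) triples where py differs from r beyond rtol."""
    mismatches = []
    for path, by_horizon in r.items():
        for l, ref in by_horizon.items():
            for field in ("beta", "se"):
                got = py[path][l][field]
                # Pure relative comparison: no absolute-tolerance slack.
                if not math.isclose(got, ref[field], rel_tol=rtol, abs_tol=0.0):
                    mismatches.append((path, l, field))
    return mismatches


# Placeholder reference and candidate values.
r_ref = {(0, 1): {1: {"beta": 0.5, "se": 0.2}}}
py_ok = {(0, 1): {1: {"beta": 0.5 * (1 + 1e-8), "se": 0.2}}}
py_bad = {(0, 1): {1: {"beta": 0.5 * (1 + 1e-4), "se": 0.2}}}

ok_result = parity_mismatches(py_ok, r_ref)
bad_result = parity_mismatches(py_bad, r_ref)
```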
Multiplier bootstrap (n_bootstrap > 0) under by_path + heterogeneity
+ survey_design inherits the existing per-path multiplier-bootstrap
gate from PR #408.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends `_render_path_effects_section` in chaisemartin_dhaultfoeuille_results.py to render a per-path "Heterogeneity Test (Section 1.5, partial)" sub-block when `path_heterogeneity_effects` is populated for the path. Mirrors the global `_render_heterogeneity_section` block, scoped to each per-path section, alongside the existing per-path placebo / cumulated / sup-t blocks.

Adds anti-regression test `TestByPathHeterogeneity::test_per_path_heterogeneity_renders_in_summary`, which asserts that both the section header and at least one rendered beta value appear in `summary()` output when `by_path + heterogeneity` is fit. Locks the schema↔renderer mirror so future per-path surfaces are not silently omitted from `summary()`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
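
The anti-regression idea in this commit (assert that the header and a rendered beta both reach the summary text) can be sketched with a stand-in renderer. `render_path_het_block`, its line layout, and the numbers are hypothetical; only the header string is taken from the commit message, and the real `_render_path_effects_section` is not reproduced here.

```python
from typing import Any, Dict, List, Tuple

HEADER = "Heterogeneity Test (Section 1.5, partial)"


def render_path_het_block(
    path: Tuple[int, ...], het_by_horizon: Dict[int, Dict[str, Any]]
) -> List[str]:
    """Render a per-path heterogeneity sub-block; emit nothing when unpopulated."""
    if not het_by_horizon:
        return []
    lines = [f"Path {path}", HEADER]
    for l in sorted(het_by_horizon):
        entry = het_by_horizon[l]
        lines.append(f"  l={l}  beta={entry['beta']:.4f}  se={entry['se']:.4f}")
    return lines


def test_per_path_block_renders() -> None:
    # Placeholder estimate; the check is about presence, not magnitude.
    text = "\n".join(render_path_het_block((0, 1, 1), {1: {"beta": 0.25, "se": 0.1}}))
    assert HEADER in text          # section header is present
    assert "0.2500" in text        # at least one rendered beta value is present


test_per_path_block_renders()
empty_block = render_path_het_block((0, 1, 1), {})
```

Asserting on a rendered value as well as the header guards against a renderer that prints the heading but drops the rows.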
Adds 3 slow regression tests for the `by_path + heterogeneity + survey_design` composition, which the original PR claimed via REGISTRY/CHANGELOG without automated coverage:

1. `test_per_path_heterogeneity_under_survey_finite`: analytical Binder TSL SE is finite per (path, l) under a PSU=group survey design.
2. `test_per_path_heterogeneity_replicate_weights_propagates_n_valid`: a BRR replicate-weight fit drives `_replicate_n_valid_list` through the per-(path, l) heterogeneity calls, and the final `survey_metadata.df_survey == n_replicates - 1`.
3. `test_survey_design_plus_n_bootstrap_with_heterogeneity_still_raises`: confirms that heterogeneity composition does not accidentally re-route around the existing per-path multiplier-bootstrap-survey gate.

Also extends the `to_dataframe(level="by_path")` docstring at `chaisemartin_dhaultfoeuille_results.py:1527` to list the new `het_*` columns alongside `cband_*` and `cumulated_*`, mirroring the always-present NaN-when-None convention. The empty-DataFrame schema already includes them; this is a docstring-only sync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6c8a68c to a57f583
/ai-review
🔁 AI review rerun (requested by @igerber)

Head SHA:

Overall Assessment

Executive Summary

Methodology

M-1 (P3 Informational): documented partial heterogeneity implementation
Severity: P3

M-2: No unmitigated methodology mismatch found in implementation
Severity: N/A

Code Quality

No P0/P1/P2 code-quality findings in the visible changed source. The new field is threaded through:

Performance

No findings. The path heterogeneity helper re-enumerates treatment paths after

Maintainability

No unmitigated maintainability findings. The schema shape

Tech Debt

No untracked blocking tech debt found in the visible context.

Security

No findings in the visible diff. I did not run a secret scanner or shell grep; this is based only on the loaded prompt content.

Documentation/Tests

DT-1 (P2): claimed multi-baseline behavior is not actually tested
Severity: P2

This is a claim-vs-test mismatch under the single-pass audit rules. No TODO mitigation is visible in the prompt, and shipped-behavior test gaps are P2.

Concrete fix: Add or correct a test with at least two switcher baseline values, e.g.:

Path to Approval
Summary

- Lifts the gate at `chaisemartin_dhaultfoeuille.py:1230-1234` so per-path event-study disaggregation composes with `heterogeneity="<col>"`.
- `_compute_path_heterogeneity_test` helper + `path_groups: Optional[Set[int]]` filter on `_compute_heterogeneity_test`. Per-path beta / SE / inference computed on the path-restricted switcher subsample; cohort dummies absorb baseline by construction (no R-divergence warning needed).
- `path_heterogeneity_effects: Dict[Tuple[int, ...], Dict[int, Dict[str, Any]]]` field; `to_dataframe(level="by_path")` extended with always-present `het_*` columns (NaN for placebo rows / when not requested).
- First `predict_het` parity baseline in the repo. Two new R generator scenarios (`multi_path_reversible_predict_het` global anchor + `multi_path_reversible_by_path_predict_het` per-path) using `dont_drop_larger_lower=TRUE` to match Python's `drop_larger_lower=False` requirement under reversal paths. Per-path beta / SE match R within `rtol=1e-6`.
- Per-(path, horizon) inference is refreshed in the final R2 P1b block (`feedback_late_if_site_inference_refresh.md`) so all per-path entries see the final `df_survey` after replicate-weight `n_valid` appends.
- Multiplier bootstrap (`n_bootstrap > 0`) under `by_path + heterogeneity + survey_design` inherits the existing per-path multiplier-bootstrap-survey gate from PR #408 (Compose by_path / paths_of_interest with survey_design (Wave 4 #10)).

Methodology references (required if estimator / math changes)

- `did_multiplegt_dyn(..., by_path, predict_het)` per-by_level dispatch (`R/R/did_multiplegt_dyn.R:226-257`)
- `path_effects` SE is unchanged and does not apply here (heterogeneity uses standard WLS coefficient IF, not the cohort-recentered IF allocator). R's `predict_het` `dont_drop_larger_lower=TRUE` flag is matched in fixture scenarios so reversal paths preserve cohort variation under heterogeneity testing.

Validation

- `tests/test_chaisemartin_dhaultfoeuille.py::TestByPathHeterogeneity`: 12 tests across gate dispatch (5), behavior (4 incl. single-path telescope at `atol=rtol=1e-14`, zero-signal anti-regression, multi-baseline `UserWarning` anti-regression), DataFrame integration (1), edge cases (2)
- `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneity`: global anchor (first `predict_het` parity baseline)
- `tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathHeterogeneity`: per-path
- `benchmarks/R/generate_dcdh_dynr_test_values.R`: 2 new scenarios (#20 + #21), 2 new extraction helpers (`extract_dcdh_predict_het`, `extract_dcdh_by_path_predict_het`)
- `benchmarks/data/dcdh_dynr_golden_values.json` regenerated (21 scenarios, 619 KB)

Security / privacy

Generated with Claude Code